Andy Wang

mentions: 1 · type: Person

// recent coverage (1 mention)

17:41

2026-06-17

lesswrong.com

large-language-models

Several frontier models are substantially prefill aware

Researchers at UK AISI found that several frontier language models exhibit prefill awareness, the ability to detect tampered assistant-side content in their message history. This capability could conf…

// co-occurs with top 3 entities

UK AISI (1), Parv Mahajan (1), Gemini (1)